W1: Intro to Computing

Welcome!

Introductions

  • Who am I?
  • What is DaSL?
  • Who are you?

    • Name, pronouns, group you work in

    • What you want to get out of the class

    • Favorite spring activity

Goals of the course

  • Fundamental concepts in programming languages: How do programs run, and how do we solve problems effectively using functions and data structures?
  • Data science fundamentals: How do you translate your scientific question to a data wrangling problem and answer it?

    Data science workflow

Culture of the course

  • Learning on the job is challenging
    • I will move at the learners’ pace
    • The aim is not mastery, but empowering you to keep learning effectively.
  • Various personal goals and applications
    • Curate content towards end of the course
  • Respect Code of Conduct

Format of the course

  • 6 classes: April 17, 24, May 1, 8, 15, 22
  • Streamed online; recordings will be available.
  • 1-2 hour exercises after each session are strongly encouraged as they provide practice.

  • Optional time to work on exercises together on Fridays Noon - 1pm PT.

  • Online discussion via Slack.

Content of the course

  1. Intro to Computing
  2. Data structures
  3. Data wrangling 1
  4. Data wrangling 2
  5. Data visualization
  6. Loading your own data in, celebratory lunch!!

What is a computer program?

  • A sequence of instructions to manipulate data for the computer to execute.
  • A series of translations: English <-> Programming Code for Interpreter <-> Machine Code for Central Processing Unit (CPU)

We will focus on English <-> Programming Code for R Interpreter in this class.

Another way of putting it: How we organize ideas <-> Instructing a computer to do something.

Setting up Posit Cloud and trying out your first analysis!

What’s the connection between English <-> Programming Code for R Interpreter?

Break

A pre-course survey:

https://forms.gle/Hr59ZbAan1JTumCa7

Grammar Structure 1: Evaluation of Expressions

  • Expressions are built out of operations or functions.
  • Operations and functions take in values of some data type and return a value, possibly of a different data type.
  • We can combine multiple expressions together to form more complex expressions: an expression can have other expressions nested inside it.

Examples

18 + 21
[1] 39
max(18, 21)
[1] 21
max(18 + 21, 65)
[1] 65
18 + (21 + 65)
[1] 104
nchar("ATCG")
[1] 4

Function machine from algebra class

Operations are just functions. We could have written:

sum(18, 21)
[1] 39
sum(18, sum(21, 65))
[1] 104
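As a further illustration of the same idea, R lets you call an operator by name, like any other function, by wrapping it in backticks. This is a small sketch; the variable names `plus_result` and `nested_result` are made up for this example.

```r
# The + operator is itself a function; backticks let us call it by name.
plus_result <- `+`(18, 21)               # same as 18 + 21
nested_result <- `+`(18, `+`(21, 65))    # same as 18 + (21 + 65)

plus_result
nested_result
```

You would rarely write code this way, but it shows that an operation and a function call follow the same evaluation rules.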

Data types

  • Numeric: 18, -21, 65, 1.25

  • Character: "ATCG", "Whatever", "948-293-0000"

  • Logical: TRUE, FALSE
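One way to check which data type a value has is the built-in `class()` function. The sketch below applies it to one value from each bullet above; the variable name `types` is only for illustration.

```r
# Ask R for the data type of a few values.
class(18)       # a Numeric value
class("ATCG")   # a Character value (note the quotation marks)
class(TRUE)     # a Logical value

# Collect the three answers together so they can be compared side by side.
types <- c(class(18), class("ATCG"), class(TRUE))
types
```

Note that "948-293-0000" is Character because of the quotation marks; without them, R would treat it as a subtraction.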

Grammar Structure 2: Storing data types in the environment

To build up a computer program, we need to store the value returned by an expression somewhere for downstream use.

x = 18 + 21

Execution rule for variable assignment

  1. Evaluate the expression to the right of =.
  2. Bind the variable to the left of = to the resulting value.
  3. The variable is stored in the environment.

<- is okay too!
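The execution rule above can be traced with a small sketch. The variable name `total` is hypothetical and chosen so it does not interfere with `x`.

```r
# Step 1: the right-hand side 18 + 21 is evaluated first, giving 39.
# Step 2: the name total is bound to 39 and stored in the environment.
total <- 18 + 21

# The same rule explains why a variable can appear on both sides:
# total + 1 is evaluated using the current value (39), then total is rebound.
total <- total + 1
total
```

Because the right-hand side is always evaluated before the binding happens, `total <- total + 1` is not a contradiction; it simply updates the stored value.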

Downstream

Look, now x can be reused downstream:

x - 2
[1] 37
y = x * 2
y
[1] 78

Grammar Structure 3: Evaluation of Functions

A function has a name, takes arguments, and returns a value of some data type.

Execution rule for functions:

Evaluate the function on its arguments. If an argument is itself an expression, evaluate that expression first.

The output of functions is called the returned value.

sqrt(nchar("hello"))
[1] 2.236068
(nchar("hello") + 4) * 2
[1] 18
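To see the execution rule at work, the nested call `sqrt(nchar("hello"))` can be unpacked into the steps R takes, innermost expression first. The names `step1` and `step2` are made up for this sketch.

```r
# The argument of sqrt() is an expression, so it is evaluated first.
step1 <- nchar("hello")   # number of characters in "hello"
step1

# Only then is the outer function evaluated on that returned value.
step2 <- sqrt(step1)
step2

# The two-step version gives the same answer as the nested one.
identical(step2, sqrt(nchar("hello")))
```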

A programming language has the following features:

  • Grammar structure to construct expressions
  • Combining expressions to create more complex expressions
  • Encapsulate complex expressions via functions to create modular and reusable tasks
  • Encapsulate complex data via data structures to allow efficient manipulation of data
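As a rough preview of the last two features, here is a minimal sketch in R. The function name `doubled_length` and the variable `ages` are hypothetical, chosen only for this example.

```r
# A function encapsulates an expression so it can be reused by name.
doubled_length <- function(text) {
  nchar(text) * 2
}
doubled_length("ATCG")

# A vector is a simple data structure that holds several values of one type,
# so a single function call can work on all of them.
ages <- c(18, 21, 65)
max(ages)
sum(ages)
```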

Tips on writing your first code

Computer = powerful + stupid

Even the smallest spelling and formatting changes can cause unexpected output and errors!

  • Write incrementally, test often
  • Check your assumptions, especially when using new functions, operations, and data types.
  • Live environments are great for testing, but not great for reproducibility.
  • Ask for help!
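A small sketch of two of these tips, under the assumption that you are working in a fresh R session. It uses `tryCatch()` only so the example keeps running after the error; the names `msg` and `result` are made up.

```r
# Spelling matters: R is case-sensitive, so Max is not the same as max.
# tryCatch() captures the error message instead of stopping the script.
msg <- tryCatch(Max(18, 21), error = function(e) conditionMessage(e))
msg

# Check your assumptions: run a new function on a small input whose
# answer you already know before relying on it.
result <- nchar("ATCG")
stopifnot(is.numeric(result), result == 4)
```

`stopifnot()` stays silent when every condition holds and raises an error otherwise, which makes it a cheap way to test as you go.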

That’s all!

Maybe see you Friday Noon - 1pm PT to practice together!